A Bayesian Model for Calibrating Reviewer Scores
Abstract
A typical technical conference has each submission reviewed by several reviewers who differ in expertise, preferred techniques, and attention to detail. Furthermore, some reviewers tend to give high scores and others low ones. These differences are detrimental to the overall quality of reviewing. Can we estimate reviewer bias scientifically? Here we first review a method for calibrating review scores used by NIPS (2006-2012), and then present a revised Bayesian model that was used for NIPS 2013 and 2014. We also investigate the potential of improving calibration performance by incorporating extra information such as review confidence factors.
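The abstract does not spell out either model, so the following is only a rough illustration of what "estimating reviewer bias" can mean, not the NIPS calibration method or the revised Bayesian model. It assumes a simple additive model, score = paper quality + reviewer bias + noise, with a zero-mean Gaussian prior on each bias, and finds the MAP estimate by alternating updates. The function name `estimate_biases`, the shrinkage parameter `lam`, and the toy data are all hypothetical.

```python
from collections import defaultdict


def estimate_biases(reviews, lam=0.1, n_iters=100):
    """Fit score ~ quality[paper] + bias[reviewer] by coordinate descent.

    reviews: list of (paper_id, reviewer_id, score) tuples.
    lam:     assumed ridge penalty on reviewer biases (equivalent to a
             zero-mean Gaussian prior); it also fixes the otherwise
             arbitrary constant shift between qualities and biases.
    """
    quality = defaultdict(float)
    bias = defaultdict(float)
    for _ in range(n_iters):
        # Update each paper's quality: mean of its bias-corrected scores.
        sums, counts = defaultdict(float), defaultdict(int)
        for paper, reviewer, score in reviews:
            sums[paper] += score - bias[reviewer]
            counts[paper] += 1
        for paper in sums:
            quality[paper] = sums[paper] / counts[paper]

        # Update each reviewer's bias: shrunken mean of their residuals.
        sums, counts = defaultdict(float), defaultdict(int)
        for paper, reviewer, score in reviews:
            sums[reviewer] += score - quality[paper]
            counts[reviewer] += 1
        for reviewer in sums:
            bias[reviewer] = sums[reviewer] / (counts[reviewer] + lam)
    return dict(quality), dict(bias)


# Toy data (made up): r1 scores one point high, r2 one point low.
reviews = [
    ("A", "r1", 7), ("A", "r2", 5), ("A", "r3", 6),
    ("B", "r1", 5), ("B", "r2", 3), ("B", "r3", 4),
]
quality, bias = estimate_biases(reviews)
calibrated = [(p, r, s - bias[r]) for p, r, s in reviews]
```

Subtracting the estimated bias from each raw score gives a calibrated score. A fuller Bayesian treatment would also model per-reviewer noise and could weight reviews by the confidence factors the abstract mentions.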
Related resources
Calibration of machine scores for pronunciation grading
Our proposed paradigm for automatic assessment of pronunciation quality uses hidden Markov models (HMMs) to generate phonetic segmentations of the student’s speech. From these segmentations, we use the HMMs to obtain spectral match and duration scores. In this work we focus on the problem of calibrating different machine scores to obtain an accurate prediction of the grades that a human expert ...
How to Calibrate the Scores of Biased Reviewers by Quadratic Programming
Peer reviewing is the key ingredient of evaluating the quality of scientific work. Based on the review scores assigned by the individual reviewers to the submissions, program committees of conferences and journal editors decide which papers to accept for publication and which to reject. However, some reviewers may be more rigorous than others, they may be biased one way or the other, and they o...
A hierarchical Bayesian model for calibrating estimates of species divergence times.
In Bayesian divergence time estimation methods, incorporating calibrating information from the fossil record is commonly done by assigning prior densities to ancestral nodes in the tree. Calibration prior densities are typically parametric distributions offset by minimum age estimates provided by the fossil record. Specification of the parameters of calibration densities requires the user to qu...
The Diversity of Model Tuning Practices in Climate Science
Many examples of calibration in climate science raise no alarms regarding model reliability. We examine one example and show that, in employing Classical Hypothesis-testing, it involves calibrating a base model against ...
Accounting for Peer Reviewer Bias with Bayesian Models
Instructors and researchers of peer review would benefit from a consistent way of characterizing peer review among students. One factor that can affect peer review is reviewer bias. For example, students may give biased assessments if some reviewers are lenient and others stringent. Accordingly, statistical models of peer review should account for reviewer bias. We present work in progress comp...
Publication date: 2015